A Probabilistic Word Class Tagging Module Based On Surface Pattern Matching

Author

  • Robert Eklund


Similar resources

Persian Part-of-Speech Tagging Using a Fuzzy Network Model

Part-of-speech tagging (POS tagging) is an ongoing research area in natural language processing (NLP). The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...

A Multi-phase Semi-supersense Tagging of Korean Unknown Nouns

Supersense tagging is the problem of finding a corresponding semantic super tag (e.g., Phenomenon, Act) based on syntactic information and annotated corpora. However, we employ semantic information rather than syntactic information and annotated corpora, because Korean has a relatively flexible syntactic structure and lacks annotated corpora. To construct the automatic sense tagging system for ...

A Morpheme-based Part-of-Speech Tagger for Chinese

This paper presents a morpheme-based part-of-speech tagger for Chinese. It consists of two main components, namely a morpheme segmenter to segment each word in a sentence into a sequence of morphemes, based on forward maximum matching, and a lexical tagger to label each morpheme with a proper tag indicating its position pattern in forming a word of a specific class, based on lexicalized hidden ...

Automatic Construction of a Chinese Electronic Dictionary

In this paper, an unsupervised approach for constructing a large-scale Chinese electronic dictionary is surveyed. The main purpose is to enable cheap and quick acquisition of a large-scale dictionary from a large untagged text corpus with the aid of the information in a small tagged seed corpus. The basic model is based on a Viterbi reestimation technique. During the dictionary construction pro...

Multi-stage Annotation using Pattern-based and Statistical-based Techniques for Automatic Thai Annotated Corpus Construction

Automated or semi-automated annotation is a practical solution for large-scale corpus construction. However, special characteristics of the Thai language, such as the lack of word-boundary and sentence-boundary markers, trigger several issues in automatic corpus annotation. This paper presents a multi-stage annotation framework, containing two stages of chunking and three stages of tagging. Two chu...



Publication date: 1993